Overview

Dataset statistics

Number of variables: 9
Number of observations: 7385
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1168
Duplicate rows (%): 15.8%
Total size in memory: 2.4 MiB
Average record size in memory: 343.2 B
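
The duplicate share above is the duplicate count divided by the number of observations (1168 / 7385, about 15.8%). As a rough sketch of how such a count might be reproduced in plain Python, the hypothetical helper below reports duplicates under two common definitions. Which definition the report uses is not stated in this text, so both are shown as assumptions, and the toy rows are made up rather than taken from the dataset.

```python
from collections import Counter


def duplicate_stats(rows):
    """Return (repeated_rows, duplicated_patterns, repeated_share).

    repeated_rows: rows that repeat an earlier identical row.
    duplicated_patterns: distinct row values that occur more than once.
    repeated_share: repeated_rows divided by the total number of rows.
    """
    counts = Counter(rows)
    repeated_rows = sum(n - 1 for n in counts.values())
    duplicated_patterns = sum(1 for n in counts.values() if n > 1)
    share = repeated_rows / len(rows) if rows else 0.0
    return repeated_rows, duplicated_patterns, share


# Hypothetical toy rows (Make, Model, Engine Size), not taken from the dataset.
rows = [
    ("ACURA", "ILX", 2.0),
    ("ACURA", "ILX", 2.0),
    ("ACURA", "ILX", 2.0),
    ("VOLVO", "V60 T5", 2.0),
    ("FIAT", "500L", 1.4),
    ("FIAT", "500L", 1.4),
]
repeated, patterns, share = duplicate_stats(rows)

# Arithmetic behind the percentage shown in the report.
reported_share = 1168 / 7385
```

With pandas, `DataFrame.duplicated()` would give the first of the two counts directly.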

Variable types

Categorical: 5
Text: 1
Numeric: 3

Variable descriptions

Make: A categorical variable representing the vehicle manufacturer. It contains 42 distinct categories with no missing values. The distribution is moderately imbalanced: the most frequent brands are Ford, Chevrolet, and BMW, while the 32 manufacturers outside the ten most frequent account for about 46% of the observations. This suggests high categorical diversity, and grouping rare categories or applying appropriate encoding strategies may be necessary in downstream modeling.
Model: A high-cardinality text feature (2053 distinct values, 27.8% of observations) with a highly skewed distribution. It combines heterogeneous information such as drivetrain, body type, and trim level, making it unsuitable for direct categorical encoding. Semantic feature extraction (e.g., drivetrain type, body style, performance indicators) is therefore preferred over direct usage or one-hot encoding.
Vehicle Class: A low-cardinality categorical feature with 16 distinct categories and no missing values. The distribution is reasonably balanced across major vehicle segments (SUV, compact, mid-size, full-size).
Engine Size(L): A continuous variable ranging from 0.9 to 8.4 liters. The distribution is heavily concentrated at 2.0 L, which accounts for 19.8% of the observations.
Cylinders: Treated as a categorical variable with eight discrete levels, dominated by 4-, 6-, and 8-cylinder engines.
Transmission: A categorical variable with 27 distinct levels, combining transmission type and number of gears. Automatic transmissions with 6 to 8 gears dominate the dataset.
Fuel Type: A categorical variable with five levels, dominated by regular (X) and premium (Z) gasoline, while alternative fuels (E85, diesel, natural gas) are relatively rare.
Fuel Consumption Comb (L/100 km): The distribution is positively skewed (skewness 0.89), with a mean of 10.98 exceeding the median of 10.6. The middle 50% of vehicles fall within the 8.9–12.6 L/100 km range (the interquartile range). The histogram appears bimodal, which may indicate two groups of vehicles with different efficiency profiles.
Interactions: CO2 emissions increase with engine size, showing a strong positive association. The overall trend is approximately linear, but the slope changes: emissions rise more steeply for smaller engines and more gradually for larger ones. Combined fuel consumption and CO2 emissions are also strongly and positively related. The scatter plot shows several nearly parallel linear patterns, indicating that CO2 emissions increase approximately linearly with fuel consumption while the data consist of distinct subgroups with similar slopes but different intercepts. This suggests that fuel consumption is a primary driver of CO2 emissions, with additional categorical factors (such as fuel type or engine technology) likely shifting emission levels.
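
One way to probe the subgroup structure described for fuel consumption and CO2 emissions is to fit a separate straight line per fuel type and compare slopes and intercepts. The sketch below does this with ordinary least squares in plain Python. The rows are copied from the Sample and Duplicate rows tables of this report; the function names are hypothetical, and because these few rows cover only fuel types X and Z they cannot by themselves confirm the separate bands seen in the scatter plot.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_by_group(rows):
    """Fit one line per group key; rows are (key, x, y) tuples."""
    groups = defaultdict(lambda: ([], []))
    for key, x, y in rows:
        groups[key][0].append(x)
        groups[key][1].append(y)
    return {key: fit_line(xs, ys) for key, (xs, ys) in groups.items()}


# (fuel type, fuel consumption in L/100 km, CO2 in g/km),
# copied from the Sample and Duplicate rows tables of this report.
rows = [
    ("Z", 8.5, 196), ("Z", 9.6, 221), ("Z", 5.9, 136),
    ("Z", 11.1, 255), ("Z", 10.6, 244), ("Z", 10.0, 230),
    ("X", 10.3, 242), ("X", 11.0, 258), ("X", 9.4, 221),
    ("X", 7.5, 176), ("X", 10.8, 252), ("X", 9.2, 213),
]
fits = fit_by_group(rows)  # {fuel type: (slope, intercept)}
```

Running the same fit on the full dataset, including the rarer fuel types, would show whether the slopes or the intercepts differ between groups.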

Alerts

Dataset has 1168 (15.8%) duplicate rows [Duplicates]
Engine Size(L) is highly overall correlated with CO2 Emissions(g/km) and 2 other fields [High correlation]
Fuel Consumption Comb (L/100 km) is highly overall correlated with CO2 Emissions(g/km) and 1 other fields [High correlation]
CO2 Emissions(g/km) is highly overall correlated with Engine Size(L) and 1 other fields [High correlation]
Make is highly overall correlated with Cylinders [High correlation]
Vehicle Class is highly overall correlated with Make and 6 other fields [High correlation]
Cylinders is highly overall correlated with Engine Size(L) and 1 other fields [High correlation]
Transmission is highly overall correlated with Make and 6 other fields [High correlation]
Fuel Type is highly overall correlated with Make and 3 other fields [High correlation]

Reproduction

Analysis started: 2026-01-09 17:08:50.022665
Analysis finished: 2026-01-09 17:08:59.080171
Duration: 9.06 seconds
Software version: ydata-profiling v4.18.0
Download configuration: config.json

Variables

Make
Categorical

High correlation 

Make is a categorical variable representing the vehicle manufacturer. It contains 42 distinct categories with no missing values. The distribution is moderately imbalanced: the most frequent brands are Ford, Chevrolet, and BMW, while a long tail of less frequent manufacturers accounts for about 46% of the observations. This suggests high categorical diversity, and grouping rare categories or applying appropriate encoding strategies may be necessary in downstream modeling.
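
The grouping of rare categories mentioned above can be sketched as follows. This is a minimal illustration, not a prescribed preprocessing step: the function name, the 20% threshold, the `OTHER` label, and the toy values are all assumptions chosen for the example.

```python
from collections import Counter


def group_rare(values, min_share=0.2, other_label="OTHER"):
    """Replace categories whose relative frequency is below min_share."""
    counts = Counter(values)
    total = len(values)
    keep = {value for value, n in counts.items() if n / total >= min_share}
    return [value if value in keep else other_label for value in values]


# Hypothetical toy column; shares are FORD 60%, BMW 30%, SMART 10%.
makes = ["FORD"] * 6 + ["BMW"] * 3 + ["SMART"]
grouped = group_rare(makes)
```

On the real column a much lower threshold would be needed, since even the most frequent make (FORD) covers only 8.5% of the rows.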

Distinct: 42
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 455.5 KiB

FORD: 628
CHEVROLET: 588
BMW: 527
MERCEDES-BENZ: 419
PORSCHE: 376
Other values (37): 4847

Length

Max length: 13
Median length: 11
Mean length: 6.1439404
Min length: 3

Characters and Unicode

Total characters: 45373
Distinct characters: 27
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ACURA
2nd row: ACURA
3rd row: ACURA
4th row: ACURA
5th row: ACURA

Common Values

Value               Count   Frequency (%)
FORD                628     8.5%
CHEVROLET           588     8.0%
BMW                 527     7.1%
MERCEDES-BENZ       419     5.7%
PORSCHE             376     5.1%
TOYOTA              330     4.5%
GMC                 328     4.4%
AUDI                286     3.9%
NISSAN              259     3.5%
JEEP                251     3.4%
Other values (32)   3393    45.9%

Length

[Figure: Histogram of lengths of the category]

Value               Count   Frequency (%)
ford                628     8.3%
chevrolet           588     7.8%
bmw                 527     7.0%
mercedes-benz       419     5.6%
porsche             376     5.0%
toyota              330     4.4%
gmc                 328     4.3%
audi                286     3.8%
nissan              259     3.4%
jeep                251     3.3%
Other values (35)   3555    47.1%

Most occurring characters

Value               Count   Frequency (%)
E                   4807    10.6%
O                   3608    8.0%
A                   3589    7.9%
R                   3114    6.9%
I                   2781    6.1%
D                   2672    5.9%
N                   2483    5.5%
C                   2458    5.4%
S                   2345    5.2%
M                   2036    4.5%
Other values (17)   15480   34.1%

Most occurring categories

Value               Count   Frequency (%)
(unknown)           45373   100.0%

Most occurring scripts

Value               Count   Frequency (%)
(unknown)           45373   100.0%

Most occurring blocks

Value               Count   Frequency (%)
(unknown)           45373   100.0%

Model
Text

The variable Model is a high-cardinality text feature (2053 distinct values, 27.8%) with a highly skewed distribution. It combines heterogeneous information such as drivetrain, body type, and trim level, making it unsuitable for direct categorical encoding. Therefore, semantic feature extraction (e.g., drivetrain type, body style, performance indicators) is preferred over direct usage or one-hot encoding.
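
As a hedged sketch of the semantic feature extraction suggested above, the snippet below pulls a coarse drivetrain label out of a model string. The tokens come from the word-frequency table for Model in this report; the function name and the mapping of tokens to two labels are assumptions made for illustration.

```python
import re

# Drivetrain tokens seen in the Model word-frequency table of this report.
# Collapsing them into two coarse labels is an illustrative assumption.
DRIVETRAIN_TOKENS = {
    "AWD": "AWD",
    "4MATIC": "AWD",
    "XDRIVE": "AWD",
    "4WD": "4WD",
    "4X4": "4WD",
}


def extract_drivetrain(model):
    """Return a coarse drivetrain label found in a model string, else None."""
    for token in re.split(r"\s+", model.upper().strip()):
        if token in DRIVETRAIN_TOKENS:
            return DRIVETRAIN_TOKENS[token]
    return None


# Model strings taken from the Sample rows of this report.
samples = ["ILX", "ILX HYBRID", "MDX 4WD", "RDX AWD"]
labels = [extract_drivetrain(m) for m in samples]
```

Models without a recognised token return `None`, which leaves the choice of a default (for example two-wheel drive) to the modeler.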

Distinct: 2053
Distinct (%): 27.8%
Missing: 0
Missing (%): 0.0%
Memory size: 496.5 KiB

Length

Max length: 41
Median length: 32
Mean length: 11.831957
Min length: 2

Characters and Unicode

Total characters: 87379
Distinct characters: 69
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 502
Unique (%): 6.8%

Sample

1st row: ILX
2nd row: ILX
3rd row: ILX HYBRID
4th row: MDX 4WD
5th row: RDX AWD

Value                Count   Frequency (%)
awd                  1128    6.8%
ffv                  592     3.6%
4wd                  477     2.9%
coupe                375     2.3%
4x4                  333     2.0%
s                    326     2.0%
4matic               239     1.4%
cabriolet            221     1.3%
xdrive               215     1.3%
cooper               204     1.2%
Other values (709)   12464   75.2%

Most occurring characters

Value                Count   Frequency (%)
(space)              9200    10.5%
A                    5815    6.7%
R                    4636    5.3%
E                    4496    5.1%
C                    3621    4.1%
T                    3510    4.0%
O                    3450    3.9%
D                    3182    3.6%
S                    3165    3.6%
I                    2352    2.7%
Other values (59)    43952   50.3%

Most occurring categories

Value                Count   Frequency (%)
(unknown)            87379   100.0%

Most occurring scripts

Value                Count   Frequency (%)
(unknown)            87379   100.0%

Most occurring blocks

Value                Count   Frequency (%)
(unknown)            87379   100.0%

Vehicle Class
Categorical

High correlation 

Vehicle Class is a low-cardinality categorical feature with 16 distinct categories and no missing values. The distribution is reasonably balanced across major vehicle segments (SUV, compact, mid-size, full-size).

Distinct: 16
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 494.8 KiB

SUV - SMALL: 1217
MID-SIZE: 1133
COMPACT: 1022
SUV - STANDARD: 735
FULL-SIZE: 639
Other values (11): 2639

Length

Max length: 24
Median length: 21
Mean length: 11.587407
Min length: 7

Characters and Unicode

Total characters: 85573
Distinct characters: 24
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: COMPACT
2nd row: COMPACT
3rd row: COMPACT
4th row: SUV - SMALL
5th row: SUV - SMALL

Common Values

Value                     Count   Frequency (%)
SUV - SMALL               1217    16.5%
MID-SIZE                  1133    15.3%
COMPACT                   1022    13.8%
SUV - STANDARD            735     10.0%
FULL-SIZE                 639     8.7%
SUBCOMPACT                606     8.2%
PICKUP TRUCK - STANDARD   538     7.3%
TWO-SEATER                460     6.2%
MINICOMPACT               326     4.4%
STATION WAGON - SMALL     252     3.4%
Other values (6)          457     6.2%

Length

[Figure: Histogram of lengths of the category]

Value                     Count   Frequency (%)
(blank)                   3042    20.8%
suv                       1952    13.3%
small                     1628    11.1%
standard                  1273    8.7%
mid-size                  1186    8.1%
compact                   1022    7.0%
pickup                    697     4.8%
truck                     697     4.8%
full-size                 639     4.4%
subcompact                606     4.1%
Other values (11)         1883    12.9%

Most occurring characters

Value                     Count   Frequency (%)
S                         8335    9.7%
A                         7531    8.8%
(space)                   7240    8.5%
C                         5478    6.4%
T                         5454    6.4%
-                         5327    6.2%
M                         5174    6.0%
I                         4979    5.8%
L                         4688    5.5%
U                         4668    5.5%
Other values (14)         26699   31.2%

Most occurring categories

Value                     Count   Frequency (%)
(unknown)                 85573   100.0%

Most occurring scripts

Value                     Count   Frequency (%)
(unknown)                 85573   100.0%

Most occurring blocks

Value                     Count   Frequency (%)
(unknown)                 85573   100.0%

Engine Size(L)
Real number (ℝ)

High correlation 

Engine Size (L) is a continuous variable ranging from 0.9 to 8.4 liters. The distribution is heavily concentrated at 2.0 L, which accounts for 19.8% of the observations.

Distinct: 51
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.1600677
Minimum: 0.9
Maximum: 8.4
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 57.8 KiB

Quantile statistics

Minimum: 0.9
5-th percentile: 1.5
Q1: 2
Median: 3
Q3: 3.7
95-th percentile: 6
Maximum: 8.4
Range: 7.5
Interquartile range (IQR): 1.7

Descriptive statistics

Standard deviation: 1.3541705
Coefficient of variation (CV): 0.42852577
Kurtosis: -0.13196328
Mean: 3.1600677
Median Absolute Deviation (MAD): 1
Skewness: 0.80918099
Sum: 23337.1
Variance: 1.8337776
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Value               Count   Frequency (%)
2                   1460    19.8%
3                   804     10.9%
3.6                 536     7.3%
3.5                 529     7.2%
2.5                 423     5.7%
2.4                 346     4.7%
1.6                 302     4.1%
5.3                 290     3.9%
1.8                 216     2.9%
1.4                 211     2.9%
Other values (41)   2268    30.7%

Value               Count   Frequency (%)
0.9                 3       < 0.1%
1                   18      0.2%
1.2                 25      0.3%
1.3                 11      0.1%
1.4                 211     2.9%
1.5                 207     2.8%
1.6                 302     4.1%
1.8                 216     2.9%
2                   1460    19.8%
2.1                 5       0.1%

Value               Count   Frequency (%)
8.4                 5       0.1%
8                   3       < 0.1%
6.8                 8       0.1%
6.7                 25      0.3%
6.6                 29      0.4%
6.5                 18      0.2%
6.4                 46      0.6%
6.3                 3       < 0.1%
6.2                 162     2.2%
6                   94      1.3%

Cylinders
Categorical

High correlation 

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.7 KiB

4: 3220
6: 2446
8: 1402
12: 151
3: 95
Other values (3): 71

Length

Max length: 2
Median length: 1
Mean length: 1.0265403
Min length: 1

Characters and Unicode

Total characters: 7581
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4
2nd row: 4
3rd row: 4
4th row: 6
5th row: 6

Common Values

Value       Count   Frequency (%)
4           3220    43.6%
6           2446    33.1%
8           1402    19.0%
12          151     2.0%
3           95      1.3%
10          42      0.6%
5           26      0.4%
16          3       < 0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value       Count   Frequency (%)
4           3220    43.6%
6           2446    33.1%
8           1402    19.0%
12          151     2.0%
3           95      1.3%
10          42      0.6%
5           26      0.4%
16          3       < 0.1%

Most occurring characters

Value       Count   Frequency (%)
4           3220    42.5%
6           2449    32.3%
8           1402    18.5%
1           196     2.6%
2           151     2.0%
3           95      1.3%
0           42      0.6%
5           26      0.3%

Most occurring categories

Value       Count   Frequency (%)
(unknown)   7581    100.0%

Most occurring scripts

Value       Count   Frequency (%)
(unknown)   7581    100.0%

Most occurring blocks

Value       Count   Frequency (%)
(unknown)   7581    100.0%

Transmission
Categorical

High correlation 

Transmission is a categorical variable with 27 distinct levels, combining transmission type and number of gears. Automatic transmissions with 6 to 8 gears dominate the dataset.
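
Because each level combines a type prefix with a gear count, the two parts can be separated before encoding. The sketch below is one possible way to do that with a regular expression; the function name is hypothetical, and the handling of codes without digits (such as `AV` in the Common Values table) as "no gear count" is an assumption.

```python
import re

# Letters give the transmission type, optional trailing digits the gear count.
_CODE = re.compile(r"^([A-Z]+)(\d+)?$")


def split_transmission(code):
    """Split a code such as 'AS6' into (type_prefix, gear_count or None)."""
    match = _CODE.match(code.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised transmission code: {code!r}")
    prefix, gears = match.groups()
    return prefix, (int(gears) if gears else None)


# Codes taken from the Sample rows and Common Values table of this report.
parsed = [split_transmission(c) for c in ["AS6", "M6", "AV7", "AV", "A9"]]
```

Splitting the 27 levels this way yields a small type feature and a numeric gear feature, which may be easier to encode than the combined code.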

Distinct: 27
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 429.8 KiB

AS6: 1324
AS8: 1211
M6: 901
A6: 789
A8: 490
Other values (22): 2670

Length

Max length: 4
Median length: 3
Mean length: 2.5773866
Min length: 2

Characters and Unicode

Total characters: 19034
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: AS5
2nd row: M6
3rd row: AV7
4th row: AS6
5th row: AS6

Common Values

Value               Count   Frequency (%)
AS6                 1324    17.9%
AS8                 1211    16.4%
M6                  901     12.2%
A6                  789     10.7%
A8                  490     6.6%
AM7                 445     6.0%
A9                  339     4.6%
AS7                 319     4.3%
AV                  295     4.0%
M5                  193     2.6%
Other values (17)   1079    14.6%

Length

[Figure: Histogram of lengths of the category]

Value               Count   Frequency (%)
as6                 1324    17.9%
as8                 1211    16.4%
m6                  901     12.2%
a6                  789     10.7%
a8                  490     6.6%
am7                 445     6.0%
a9                  339     4.6%
as7                 319     4.3%
av                  295     4.0%
m5                  193     2.6%
Other values (17)   1079    14.6%

Most occurring characters

Value               Count   Frequency (%)
A                   6200    32.6%
6                   3259    17.1%
S                   3127    16.4%
M                   1831    9.6%
8                   1802    9.5%
7                   1026    5.4%
V                   576     3.0%
9                   419     2.2%
5                   307     1.6%
1                   210     1.1%
Other values (2)    277     1.5%

Most occurring categories

Value               Count   Frequency (%)
(unknown)           19034   100.0%

Most occurring scripts

Value               Count   Frequency (%)
(unknown)           19034   100.0%

Most occurring blocks

Value               Count   Frequency (%)
(unknown)           19034   100.0%

Fuel Type
Categorical

High correlation 

Fuel Type is a categorical variable with five levels, dominated by regular (X) and premium (Z) gasoline, while alternative fuels (E85, diesel, natural gas) are relatively rare.

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 418.4 KiB

X: 3637
Z: 3202
E: 370
D: 175
N: 1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 7385
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: Z
2nd row: Z
3rd row: Z
4th row: Z
5th row: Z

Common Values

Value       Count   Frequency (%)
X           3637    49.2%
Z           3202    43.4%
E           370     5.0%
D           175     2.4%
N           1       < 0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value       Count   Frequency (%)
x           3637    49.2%
z           3202    43.4%
e           370     5.0%
d           175     2.4%
n           1       < 0.1%

Most occurring characters

Value       Count   Frequency (%)
X           3637    49.2%
Z           3202    43.4%
E           370     5.0%
D           175     2.4%
N           1       < 0.1%

Most occurring categories

Value       Count   Frequency (%)
(unknown)   7385    100.0%

Most occurring scripts

Value       Count   Frequency (%)
(unknown)   7385    100.0%

Most occurring blocks

Value       Count   Frequency (%)
(unknown)   7385    100.0%

Fuel Consumption Comb (L/100 km)
Real number (ℝ)

High correlation 

The Fuel Consumption Comb (L/100 km) distribution is positively skewed (skewness 0.89), with a mean of 10.98 exceeding the median of 10.6. The middle 50% of vehicles fall within the 8.9–12.6 L/100 km range (the interquartile range). The histogram appears bimodal, which may indicate two groups of vehicles with different efficiency profiles.
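
To make the skewness reading concrete, the sketch below computes a moment-based skewness coefficient and checks the mean-above-median pattern on a small sample. The estimator used by the report is not stated in this text, so the population formula here is an assumption and may differ slightly from the reported 0.89 on the real data; the values are hypothetical, not taken from the dataset.

```python
from statistics import mean, median


def moment_skewness(values):
    """Population moment coefficient of skewness: m3 / m2 ** 1.5."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((x - mu) ** 2 for x in values) / n
    m3 = sum((x - mu) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


# Hypothetical consumption values with a long right tail.
values = [8.0, 9.0, 9.5, 10.0, 10.5, 11.0, 12.0, 18.0]
skew = moment_skewness(values)

# A right-skewed sample typically has its mean pulled above the median.
right_skewed = skew > 0 and mean(values) > median(values)
```

A positive coefficient together with a mean above the median matches the pattern described for this variable.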

Distinct: 181
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.975071
Minimum: 4.1
Maximum: 26.1
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 57.8 KiB

Quantile statistics

Minimum: 4.1
5-th percentile: 7.2
Q1: 8.9
Median: 10.6
Q3: 12.6
95-th percentile: 16.5
Maximum: 26.1
Range: 22
Interquartile range (IQR): 3.7

Descriptive statistics

Standard deviation: 2.8925063
Coefficient of variation (CV): 0.2635524
Kurtosis: 1.3935754
Mean: 10.975071
Median Absolute Deviation (MAD): 1.8
Skewness: 0.89331572
Sum: 81050.9
Variance: 8.3665927
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Value                Count   Frequency (%)
9.4                  145     2.0%
8.4                  136     1.8%
9.8                  135     1.8%
9.1                  132     1.8%
10.3                 130     1.8%
8.7                  128     1.7%
11                   127     1.7%
9.9                  125     1.7%
10.7                 124     1.7%
9                    121     1.6%
Other values (171)   6082    82.4%

Value                Count   Frequency (%)
4.1                  4       0.1%
4.2                  1       < 0.1%
4.3                  2       < 0.1%
4.4                  2       < 0.1%
4.5                  5       0.1%
4.7                  9       0.1%
4.8                  7       0.1%
4.9                  6       0.1%
5                    5       0.1%
5.1                  12      0.2%

Value                Count   Frequency (%)
26.1                 2       < 0.1%
25.9                 2       < 0.1%
25.8                 2       < 0.1%
25.7                 2       < 0.1%
23.9                 1       < 0.1%
23                   1       < 0.1%
22.6                 4       0.1%
22.5                 1       < 0.1%
22.2                 3       < 0.1%
22.1                 2       < 0.1%

CO2 Emissions(g/km)
Real number (ℝ)

High correlation 

Distinct: 331
Distinct (%): 4.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 250.5847
Minimum: 96
Maximum: 522
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 57.8 KiB

Quantile statistics

Minimum: 96
5-th percentile: 169
Q1: 208
Median: 246
Q3: 288
95-th percentile: 354
Maximum: 522
Range: 426
Interquartile range (IQR): 80

Descriptive statistics

Standard deviation: 58.512679
Coefficient of variation (CV): 0.2335046
Kurtosis: 0.47880085
Mean: 250.5847
Median Absolute Deviation (MAD): 40
Skewness: 0.52609381
Sum: 1850568
Variance: 3423.7336
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Value                Count   Frequency (%)
242                  85      1.2%
221                  82      1.1%
214                  77      1.0%
230                  77      1.0%
294                  76      1.0%
232                  76      1.0%
258                  75      1.0%
253                  75      1.0%
246                  75      1.0%
209                  74      1.0%
Other values (321)   6613    89.5%

Value                Count   Frequency (%)
96                   4       0.1%
99                   1       < 0.1%
102                  1       < 0.1%
103                  1       < 0.1%
104                  2       < 0.1%
105                  3       < 0.1%
106                  2       < 0.1%
108                  2       < 0.1%
109                  2       < 0.1%
110                  7       0.1%

Value                Count   Frequency (%)
522                  3       < 0.1%
493                  2       < 0.1%
488                  1       < 0.1%
487                  1       < 0.1%
485                  1       < 0.1%
476                  1       < 0.1%
473                  1       < 0.1%
467                  1       < 0.1%
465                  3       < 0.1%
464                  2       < 0.1%

Interactions

[Figures: nine interaction scatter plots]

Correlations

                                  Engine Size(L)  Fuel Consumption Comb (L/100 km)  CO2 Emissions(g/km)
Engine Size(L)                    1.000           0.817                             0.851
Fuel Consumption Comb (L/100 km)  0.817           1.000                             0.918
CO2 Emissions(g/km)               0.851           0.918                             1.000

                                  Engine Size(L)  Fuel Consumption Comb (L/100 km)  CO2 Emissions(g/km)
Engine Size(L)                    1.000           0.862                             0.869
Fuel Consumption Comb (L/100 km)  0.862           1.000                             0.963
CO2 Emissions(g/km)               0.869           0.963                             1.000

                                  Engine Size(L)  Fuel Consumption Comb (L/100 km)  CO2 Emissions(g/km)
Engine Size(L)                    1.000           0.690                             0.698
Fuel Consumption Comb (L/100 km)  0.690           1.000                             0.911
CO2 Emissions(g/km)               0.698           0.911                             1.000

                                  Make   Vehicle Class  Engine Size(L)  Cylinders  Transmission  Fuel Type  Fuel Consumption Comb (L/100 km)  CO2 Emissions(g/km)
Make                              1.000  0.804          0.839           0.892      0.901         0.735      0.706                             0.762
Vehicle Class                     0.804  1.000          0.545           0.623      0.734         0.519      0.617                             0.601
Engine Size(L)                    0.839  0.545          1.000           0.812      0.655         0.373      0.681                             0.709
Cylinders                         0.892  0.623          0.812           1.000      0.529         0.293      0.692                             0.743
Transmission                      0.901  0.734          0.655           0.529      1.000         0.617      0.608                             0.597
Fuel Type                         0.735  0.519          0.373           0.293      0.617         1.000      0.659                             0.373
Fuel Consumption Comb (L/100 km)  0.706  0.617          0.681           0.692      0.608         0.659      1.000                             0.953
CO2 Emissions(g/km)               0.762  0.601          0.709           0.743      0.597         0.373      0.953                             1.000

                                  CO2 Emissions(g/km)  Cylinders  Engine Size(L)  Fuel Consumption Comb (L/100 km)  Fuel Type  Make   Transmission  Vehicle Class
CO2 Emissions(g/km)               1.000                0.477      0.869           0.963                             0.164      0.381  0.262         0.285
Cylinders                         0.477                1.000      0.587           0.423                             0.184      0.606  0.242         0.267
Engine Size(L)                    0.869                0.587      1.000           0.862                             0.242      0.499  0.280         0.262
Fuel Consumption Comb (L/100 km)  0.963                0.423      0.862           1.000                             0.335      0.337  0.270         0.297
Fuel Type                         0.164                0.184      0.242           0.335                             1.000      0.447  0.353         0.296
Make                              0.381                0.606      0.499           0.337                             0.447      1.000  0.418         0.359
Transmission                      0.262                0.242      0.280           0.270                             0.353      0.418  1.000         0.309
Vehicle Class                     0.285                0.267      0.262           0.297                             0.296      0.359  0.309         1.000
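
The matrices above give different values for the same variable pairs, as would be expected from different correlation coefficients. Which coefficient each matrix uses is not stated in this text, so the following is only an illustration of how two common choices, Pearson (linear) and Spearman (rank-based), can differ. The function names are hypothetical, and the ten data points are the Engine Size(L) and CO2 Emissions(g/km) values from the first ten Sample rows of this report.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# First ten Sample rows of this report: Engine Size(L) and CO2 Emissions(g/km).
engine = [2.0, 2.4, 1.5, 3.5, 3.5, 3.5, 3.5, 3.7, 3.7, 2.4]
co2 = [196, 221, 136, 255, 244, 230, 232, 255, 267, 212]
r_pearson = pearson(engine, co2)
r_spearman = spearman(engine, co2)
```

Ten rows are far too few to reproduce the reported coefficients; the point is only that a monotonic but non-linear relationship scores higher on the rank-based measure than on the linear one.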

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

      Make        Model          Vehicle Class          Engine Size(L)  Cylinders  Transmission  Fuel Type  Fuel Consumption Comb (L/100 km)  CO2 Emissions(g/km)
0     ACURA       ILX            COMPACT                2.0             4          AS5           Z          8.5                               196
1     ACURA       ILX            COMPACT                2.4             4          M6            Z          9.6                               221
2     ACURA       ILX HYBRID     COMPACT                1.5             4          AV7           Z          5.9                               136
3     ACURA       MDX 4WD        SUV - SMALL            3.5             6          AS6           Z          11.1                              255
4     ACURA       RDX AWD        SUV - SMALL            3.5             6          AS6           Z          10.6                              244
5     ACURA       RLX            MID-SIZE               3.5             6          AS6           Z          10.0                              230
6     ACURA       TL             MID-SIZE               3.5             6          AS6           Z          10.1                              232
7     ACURA       TL AWD         MID-SIZE               3.7             6          AS6           Z          11.1                              255
8     ACURA       TL AWD         MID-SIZE               3.7             6          M6            Z          11.6                              267
9     ACURA       TSX            COMPACT                2.4             4          AS5           Z          9.2                               212

      Make        Model          Vehicle Class          Engine Size(L)  Cylinders  Transmission  Fuel Type  Fuel Consumption Comb (L/100 km)  CO2 Emissions(g/km)
7375  VOLVO       S90 T6 AWD     MID-SIZE               2.0             4          AS8           Z          9.6                               223
7376  VOLVO       V60 T5         STATION WAGON - SMALL  2.0             4          AS8           Z          8.9                               208
7377  VOLVO       V60 T6 AWD     STATION WAGON - SMALL  2.0             4          AS8           Z          9.4                               219
7378  VOLVO       V60 CC T5 AWD  STATION WAGON - SMALL  2.0             4          AS8           Z          9.4                               220
7379  VOLVO       XC40 T4 AWD    SUV - SMALL            2.0             4          AS8           X          9.0                               210
7380  VOLVO       XC40 T5 AWD    SUV - SMALL            2.0             4          AS8           Z          9.4                               219
7381  VOLVO       XC60 T5 AWD    SUV - SMALL            2.0             4          AS8           Z          9.9                               232
7382  VOLVO       XC60 T6 AWD    SUV - SMALL            2.0             4          AS8           Z          10.3                              240
7383  VOLVO       XC90 T5 AWD    SUV - STANDARD         2.0             4          AS8           Z          9.9                               232
7384  VOLVO       XC90 T6 AWD    SUV - STANDARD         2.0             4          AS8           Z          10.7                              248

Duplicate rows

Most frequently occurring

      Make        Model          Vehicle Class          Engine Size(L)  Cylinders  Transmission  Fuel Type  Fuel Consumption Comb (L/100 km)  CO2 Emissions(g/km)  # duplicates
686   LEXUS       GS F           COMPACT                5.0             8          AS8           Z          12.5                              293                  5
227   CHRYSLER    300            FULL-SIZE              3.6             6          A8            X          10.3                              242                  4
231   CHRYSLER    300 AWD        FULL-SIZE              3.6             6          A8            X          11.0                              258                  4
317   FIAT        500L           STATION WAGON - SMALL  1.4             4          A6            X          9.4                               221                  4
531   INFINITI    QX60 AWD       SUV - SMALL            3.5             6          AV7           Z          10.9                              257                  4
707   LEXUS       NX 300h AWD    SUV - SMALL            2.5             4          AV6           X          7.5                               176                  4
710   LEXUS       RC F           SUBCOMPACT             5.0             8          AS8           Z          12.6                              289                  4
712   LEXUS       RX 350 AWD     SUV - SMALL            3.5             6          AS8           X          10.8                              252                  4
716   LEXUS       RX 450h AWD    SUV - STANDARD         3.5             6          AV6           Z          7.9                               185                  4
894   MITSUBISHI  RVR 4WD        SUV - SMALL            2.0             4          AV6           X          9.2                               213                  4